AI-Driven Performance Modeling for AI Inference Workloads

Abstract

Deep Learning (DL) is moving towards deploying workloads not only in cloud datacenters, but also on local devices. Although these workloads are mostly limited to inference tasks, this still widens the range of possible target architectures significantly. Additionally, these new targets usually come with drastically reduced computation performance and memory sizes compared to the traditionally used architectures, and they put the key optimization focus on efficiency, as they often depend on batteries. To help developers quickly estimate the performance of a neural network during its design phase, performance models could be used. However, these models are expensive to implement, as they require in-depth knowledge about the hardware architecture and the algorithms used. Although AI-based solutions exist, they either require large datasets that are difficult to collect on low-performance targets and/or are limited to a small number of target platforms and metrics. Our solution exploits the block-based structure of neural networks, as well as the high similarity in the typically used layer configurations across neural networks, enabling the training of accurate models on significantly smaller datasets. In addition, our solution is not limited to a specific architecture or metric. We showcase the feasibility of the solution on a set of seven devices from four different hardware architectures, with up to three performance metrics per target, including power consumption and memory footprint. Our tests have shown that the solution achieved an error of less than 1 ms (2.6%) in latency, 0.12 J (4%) in energy consumption and 11 MiB (1.5%) in memory allocation for the whole-network inference prediction, while being up to five orders of magnitude faster than a benchmark.
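The abstract does not say which model type or input features the solution uses, so the following is only a minimal sketch of the general idea of block-based performance modeling: fit one small predictor per layer type from a handful of measurements, then estimate a whole network by summing the per-layer predictions. The linear fit, the single "work" feature, the function names and the sample measurements are all assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a * x + b for one layer type."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        # All samples share the same x: fall back to a constant predictor.
        return 0.0, mean_y
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    return a, mean_y - a * mean_x


def train_layer_models(samples):
    """Group (layer_type, work, latency_ms) samples and fit one line per type."""
    grouped = defaultdict(lambda: ([], []))
    for layer_type, work, latency_ms in samples:
        grouped[layer_type][0].append(work)
        grouped[layer_type][1].append(latency_ms)
    return {t: fit_line(xs, ys) for t, (xs, ys) in grouped.items()}


def predict_network(models, layers):
    """Estimate whole-network latency as the sum of per-layer predictions."""
    total = 0.0
    for layer_type, work in layers:
        a, b = models[layer_type]
        total += a * work + b
    return total


# Hypothetical measurements, invented for this example (not from the paper):
# (layer type, abstract amount of work, measured latency in ms).
samples = [
    ("conv", 1.0, 2.1), ("conv", 2.0, 4.1), ("conv", 4.0, 8.1),
    ("dense", 1.0, 0.6), ("dense", 3.0, 1.6),
]
models = train_layer_models(samples)

# A hypothetical two-layer network described by its layer configurations.
estimate = predict_network(models, [("conv", 3.0), ("dense", 2.0)])
print(f"estimated latency: {estimate:.2f} ms")
```

Because many networks reuse similar layer configurations, a per-layer dataset of this kind can stay small while still covering new architectures. The same structure could in principle be repeated for other metrics such as energy or memory, although simply summing per-layer values is itself an assumption that may not hold for peak memory.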

Related resources

Compiling AI Engineering Models for Probabilistic Inference

In engineering domains, AI decision making is often confronted with problems that lie at the intersection of logic-based and probabilistic reasoning. A typical example is the plan assessment problem studied in this paper, which comprises the identification of possible faults and the computation of remaining success probabilities based on a system model. In addition, AI solutions to such problem...

Modeling Progress in AI

Participants in recent discussions of AI-related issues ranging from intelligence explosion to technological unemployment have made diverse claims about the nature, pace, and drivers of progress in AI. However, these theories are rarely specified in enough detail to enable systematic evaluation of their assumptions or to extrapolate progress quantitatively, as is often done with some success in...

When Will AI Exceed Human Performance? Evidence from AI Experts

Advances in artificial intelligence (AI) will transform modern life by reshaping transportation, health, science, finance, and the military [1, 2, 3]. To adapt public policy, we need to better anticipate these advances [4, 5]. Here we report the results from a large survey of machine learning researchers on their beliefs about progress in AI. Researchers predict AI will outperform humans in man...

Search and Inference in AI Planning

While Planning has been a key area in Artificial Intelligence since its beginnings, significant changes have occurred in the last decade as a result of new ideas and a more established empirical methodology. In this invited talk, I will focus on Optimal Planning where these new ideas can be understood along two dimensions: branching and pruning. Both heuristic search planners, and SAT and CSP p...

Relative Entropy, Probabilistic Inference, and AI

is an information-theoretic measure of the dissimilarity between q = q_1, ..., q_n and p = p_1, ..., p_n (H is also called cross-entropy, discrimination information, directed divergence, I-divergence, K-L number, among other terms). Various properties of relative entropy have led to its widespread use in information theory. These properties suggest that relative entropy has a role to play in...

Journal

Journal title: Electronics

Year: 2022

ISSN: 2079-9292

DOI: https://doi.org/10.3390/electronics11152316